A Concept Lattice-Based Kernel for SVM Text Classification
نویسندگان
چکیده
Standard Support Vector Machines (SVM) text classification relies on bag-of-words kernel to express the similarity between documents. We show that a document lattice can be used to define a valid kernel function that takes into account the relations between different terms. Such a kernel is based on the notion of conceptual proximity between pairs of terms, as encoded in the document lattice. We describe a method to perform SVM text classification with concept lattice-based kernel, which consists of text pre-processing, feature selection, lattice construction, computation of pairwise term similarity and kernel matrix, and SVM classification in the transformed feature space. We tested the accuracy of the proposed method on the 20NewsGroup database: the results show an improvement over the standard SVM when very little training data are available.
منابع مشابه
MULTI CLASS BRAIN TUMOR CLASSIFICATION OF MRI IMAGES USING HYBRID STRUCTURE DESCRIPTOR AND FUZZY LOGIC BASED RBF KERNEL SVM
Medical Image segmentation is to partition the image into a set of regions that are visually obvious and consistent with respect to some properties such as gray level, texture or color. Brain tumor classification is an imperative and difficult task in cancer radiotherapy. The objective of this research is to examine the use of pattern classification methods for distinguishing different types of...
متن کاملText Classification Using Support Vector Machine with Mixture of Kernel
Recent studies have revealed that emerging modern machine learning techniques are advantageous to statistical models for text classification, such as SVM. In this study, we discuss the applications of the support vector machine with mixture of kernel (SVM-MK) to design a text classification system. Differing from the standard SVM, the SVM-MK uses the 1-norm based object function and adopts the ...
متن کاملA corpus-based semantic kernel for text classification by using meaning values of terms
Text categorization plays a crucial role in both academic and commercial platforms due to the growing demand for automatic organization of documents. Kernel-based classification algorithms such as Support Vector Machines (SVM) have become highly popular in the task of text mining. This is mainly due to their relatively high classification accuracy on several application domains as well as their...
متن کاملSUBCLASS FUZZY-SVM CLASSIFIER AS AN EFFICIENT METHOD TO ENHANCE THE MASS DETECTION IN MAMMOGRAMS
This paper is concerned with the development of a novel classifier for automatic mass detection of mammograms, based on contourlet feature extraction in conjunction with statistical and fuzzy classifiers. In this method, mammograms are segmented into regions of interest (ROI) in order to extract features including geometrical and contourlet coefficients. The extracted features benefit from...
متن کاملSeparating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir
The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVM’s) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...
متن کامل